We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation. We rely on graph-convolutional networks (GCNs), a recent class of neural networks developed for modeling graph-structured data. Our GCNs use predicted syntactic dependency trees of source sentences to produce representations of words (i.e., hidden states of the encoder) that are sensitive to their syntactic neighborhoods. GCNs take word representations as input and produce word representations as output, so they can easily be incorporated as layers into standard encoders (e.g., on top of bidirectional RNNs or convolutional neural networks). We evaluate their effectiveness with English-German and English-Czech translation experiments for different types of encoders and observe substantial improvements over their syntax-agnostic versions in all the considered setups.
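To make the layer-stacking idea concrete, below is a minimal sketch (not the authors' released code) of a syntactic GCN layer in PyTorch: it consumes per-word encoder states and a dependency adjacency matrix and returns updated per-word states of the same shape, so it can sit on top of a BiRNN or CNN encoder. The class name `SyntacticGCNLayer` and the direction-typed weights are illustrative assumptions; the full model in the paper additionally conditions on dependency labels and uses edge-wise gating, both omitted here for brevity.

```python
# Minimal sketch of a syntactic GCN layer, assuming PyTorch.
# Illustrative only: the paper's full layer also uses dependency
# labels and edge gating, which are left out here.
import torch
import torch.nn as nn

class SyntacticGCNLayer(nn.Module):
    """One graph-convolutional layer over a dependency tree.

    Input:  word representations (e.g., BiRNN hidden states),
            shape [batch, seq_len, dim], plus an adjacency matrix
            adj of shape [batch, seq_len, seq_len] where
            adj[b, i, j] = 1 if word j is the syntactic head of word i.
    Output: updated word representations, same shape as the input.
    """

    def __init__(self, dim: int):
        super().__init__()
        # Separate weights per edge direction, plus a self-loop,
        # reflecting the directionality of dependency edges.
        self.w_in = nn.Linear(dim, dim)    # messages from heads
        self.w_out = nn.Linear(dim, dim)   # messages from dependents
        self.w_self = nn.Linear(dim, dim)  # self-loop

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Each word gathers from its head (incoming edges) ...
        msg_in = torch.bmm(adj, self.w_in(h))
        # ... and from its dependents (outgoing edges, via transpose).
        msg_out = torch.bmm(adj.transpose(1, 2), self.w_out(h))
        return torch.relu(msg_in + msg_out + self.w_self(h))

# Usage: stack on top of any encoder's per-word states.
batch, seq_len, dim = 2, 5, 16
states = torch.randn(batch, seq_len, dim)         # e.g., BiRNN outputs
adj = torch.zeros(batch, seq_len, seq_len)
adj[:, 1, 0] = 1.0                                # word 1's head is word 0
gcn = SyntacticGCNLayer(dim)
out = gcn(states, adj)                            # shape: [2, 5, 16]
```

Because the layer preserves the shape of its input, several such layers can be stacked to widen each word's receptive field over the dependency tree, and the final states feed the attention mechanism exactly as the syntax-agnostic encoder's states would.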